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On  the  Relationship  Between  the  Discrete  Fourier 
Transform  and  Maximum  Likelihood  Estimation 


1.  INTRODUCTION 


It  is  commonly  known  that  for  the  estimation  of  the  frequency  of  a  single  sinusoidal 
source  in  white  Gaussian  noise  the  method  using  a  discrete  Fourier  transform  (DFT)  and 
periodogram  is  equivalent  to  maximum  likelihood  estimation.  This  method  can  also  be  used 
to  find  the  ML  estimate  for  a  direction-of-arrival  (DOA)  corresponding  to  a  single  source  in 
an  array  processing  scenario.  For  multiple  DOAs  this  method  does  not  in  general  produce 
ML  estimates.  It  will  be  demonstrated,  however,  that  the  DFT/periodogram  estimation 
method  can  be  generalized  to  provide  maximum  likelihood  estimates  for  multiple  sources. 

This  memorandum  is  organized  as  follows.  In  the  second  section  the  maximum  likeli¬ 
hood  equation  and  corresponding  least-squares  problem  are  given.  In  section  3  it  is  shown 
that  the  standard  DFT/periodogram  method  for  the  ML  estimation  of  a  single  DOA  can 
be  generalized  to  yield  maximum  likelihood  estimates  for  multiple  DOAs.  A  fast  maximum 
likelihood  estimation  algorithm  (FMLE)  is  discussed  in  section  4,  and  conclusions  are  given 
in  section  5. 

The  DOA  problem  treated  here  arises  from  M  narrowband  farfield  signals  impinging 
on  a  linear  array  of  N  equally  spaced  sensors.  The  sensor  outputs  are  subject  to  additive, 
zero-mean,  white  Gaussian  noise.  The  equation  describing  this  scenario  is 

y(k)  =  D(k)p(k)  +  p(k),  k  =  1,2,..., 

where  j/(Ar),  called  a  data  snapshot,  is  an  N-vector  of  array  outputs  at  sampling  time  k. 
The  columns  of  D{k )  =  [di(Ar)  d-zik)  -  •  •  d\f(k)\  are  steering  vectors  at  time  k ,  where 

dt  =  [1  exp(jui)  -  •  •  exp(jui(N-  1))]T,  i  =  l,...,Af,  (1) 

and  Ui  =f  27r/?cos (0,).  Here,  6{  is  the  ith  bearing  and  ft  is  the  interelement  spacing  of  the 
array  in  wavelengths.  The  elements  of  the  signal  vector  p(k)  are  complex  exponentials: 

Pi(k)  =  Ci(k)exp(j(fi(k)),  z  =  l,---,M,  k  =  1,2,..., 
representing  real  signal  amplitudes  c,-  and  phases  p%.  The  noise  vector  at  time  k  is  p(k). 


2.  DEVELOPMENT  OF  THE  LEAST-SQUARES  PROBLEM 


An  A— vector  of  the  form: 


1  zz  ■■■  z 


N- 1 


,  2  =  e?e,  j  =f  V^T, 


1 


will  be  called  a  Fourier  vector.  When  Fourier  vectors  are  mentioned  in  the  context  of  array 
processing  as  in  (1),  they  will  also  be  referred  to  as  steering  vectors. 

Treating  the  array  processing  problem  in  the  frequency  domain,  consider  a  data  snapshot 
represented  by  an  N-vector  consisting  of  samples  from  the  elements  of  a  linear  array,  where 
the  data  yn  from  the  nth  element  consist  of  a  weighted  sum  of  M  complex  exponentials 
corresponding  to  M  sources: 

M 

Vn  =  YLf>i  exp (jujin)  +  rj(n),  n  =  1,  •  •  • ,  N, 

i=i 

where  p(n)  is  the  nth  element  of  the  noise  vector,  p.  The  maximum  likelihood  estimates 
for  pi  and  ul  involve  the  minimization  of  the  following  least-squares  error:1 

N  M  2 

E  =  Y1  Vn-YuPi  eXP  ti&n)  ,  (2) 

n=  1  t'=l 

where  the  p'{s  and  u[s  are  estimates.  Let  A  ~  [j/i  •  •  •  Vn]T  and  D  =  [d±  ■■■  d\f].  writh 
p  —  [p\  •  •  ■  Pm]T ■  The  least-squares  problem  in  (2)  when  generalized  for  r  data  snapshots 
can  be  written  in  matrix/ vector  form  and  stated  as  follows: 

The  M-Source  Problem  Let  A  be  a  complex  N  x  r  matrix.  Find  an  N  x  M  matrix 
D  with  distinct  Fourier  vectors  as  columns  and  an  M  x  r  matrix  p  such  that  the  following 
holds: 

{D,  p}  =  arg  min  \\A-Dp\\2F.  (3) 

D,p 

The  matrix  norm  \\A\\2F  =  J2i,j  \aij\2  where  ||T||/r  denotes  the  Frobenius  norm  of  the  matrix 
A.  Here,  arg  minD^(-)  denotes  the  values  of  the  matrices  D  and  p,  which  minimize  the  given 
expression  in  parentheses,  and  similarly  for  arg  maxj5)()(-). 

Using  the  projection  theorem2  the  error  vector  A  —  Dp  is  orthogonal  to  the  column 
space  in  which  the  solution  lies.  Therefore,  the  orthogonal  projection  of  the  error  onto  the 
column  space  of  D  will  be  zero.  If  the  columns  of  D  are  assumed  to  be  distinct,  then  the 
orthogonal  projection  matrix  can  be  written  as 

and  thus 

D{DHD)~lDH  (. A  -  Dp)  =  0. 

This  can  then  be  solved  for  p  to  obtain 

p  =  (Dh  D)-1  Dh  A,  (4) 

and  it  follows  that 

argmin  \\A  —  Dp\\2F  =  arg  max  trace  (^AH  D(DH  D)~x  DH  .  (5) 


2 


The  formulation  of  the  maximum  likelihood  problem  in  (5)  can  be  found  in  Kay.3 
The  remainder  of  this  memorandum  presents  a  method  for  improving  the  efficiency  of  the 
numerical  calculation  of  (5). 


3.  EXTENSION  OF  THE  DFT/PERIODOGRAM  METHOD  TO  ML 
ESTIMATION  OF  MULTIPLE  DOAS 


In  general  the  matrix  A  will  have  r  columns,  each  corresponding  to  a  data  snapshot 
and  the  steering  vector  matrix  D  will  have  M  columns  representing  an  M-source  model. 
A  specific  case  will  now  be  considered,  where  r  =  2  and  M  =  2.  Consider  the  following 
two  source,  two  data  snapshot  problem: 

ipjn  ||A  —  Dijp\\p,  (6) 


where  A  is  N  x  2,  Dtj  is  N  x  2,  and  p  is  2  x  2,  and  where  DtJ  and  p  are  to  be  determined. 
Let  Tjm  denote  the  DFT  matrix  of  size  N ,  scaled  so  that  JFjv  is  unitary:  =  /jv-  Let 

the  columns  of  Dtj  be  general  Fourier  vectors  and  let  the  columns  of  DZJ  be  Fourier  vectors 
that  are  also  columns  of  an  inverse  DFT  matrix.  (The  fact  that  a  vector  /  is  a  Fourier 
vector  doesn’t  imply  that  it  is  a  column  of  an  inverse  DFT  matrix.)  Let  e,  be  the  unit 
vector  with  a  'V  in  the  ith  position  and  let  =f  [e;  ej\. 


Since  the  norm  ||  •  ||f  is  unitarily  invariant4  and  Tn  is  a  unitary  matrix  (6)  is  equivalent 


to 


min  \\TnA  -  TNDijp ||p. 


hj,p 


Let  C  =f  TnA.  Since  Dzj  =  £;j,  (7)  is  equivalent  to 

mm\\C  -  SijpWp. 


(7) 


(8) 


Here, 


£ijp  — 


(  0  0  \ 

P\1  P\2 

Pl\  P22 


row  1 


row  j 


Let 


V  0 

0 

J 

def  ( 

Cu 

Ci2 

=  l 

Cn 

Cj  2 

3 


(9) 


Given  i  and  j,  (8)  is  minimized  by  letting  p  =  C,y ,  and  so 

nain||G  -  £ijp\\2F  =  \\C\\2F  -  max||C,y||^. 

hJ 

This  solution  is  in  general  of  limited  value  since  the  search  is  performed  over  only  a  fixed 
number  N  of  Fourier  vectors,  which  are  restricted  to  be  columns  of  .  A  finer  angular 
increment  than  2 tt/N  is  often  necessary.  To  this  end,  a  generalization  of  (6)  will  now 
be  developed.  Let  the  matrix  F  be  the  first  N  rows  of  T£x ,  with  L  >  N.  An  angular 
increment  2n  /  L  between  sucessive  Fourier  N-vectors  can  be  obtained  by  choosing  them  to 
be  the  columns  of  F.  The  minimization  problem  is  now 

mm\\A-  Dijp\\2F.  (10) 

h3,P 


This  is  similar  to  (6)  except  that  here  Dij  has  non-orthonormal  Fourier  N-vectors  as 
columns.  Consider  the  expression 


(11) 


where  the  columns  of  the  matrix  A  have  been  zero-padded  to  length  L.  and  D ,-y  has  been 
extended  to  Dij,  where  Dij  is  now  columns  i  and  j  of  the  matrix  J-£l.  The  expression  in 
(11)  can  be  rewritten  as 


l(o)-Hl 

=  \\A-DijP&  -  \\DiiP\?F+\\t)ijp\\2F 

=  \\A  -  DiiP\ \l  -  \\Dijp\\2F  + 

=  ||A  —  DijpW'p  —  ||T)^p|||’  +  ||p||f- 


(12) 


Let  B  be  defined  as 


and  Bij  as 


B  =  Fl 


B  =f 


Ba  Bi2  \ 

Bji  Bj2  J 


An  alternative  expression  for  that  in  (11)  is 


=  \\B-£iA2F 

=  \\B\\2F-\\Bij\\lA\\Bxj-p\\l 

=  -  trace  (Bfjp)  -  trace  ( pHB{j )  +  ||p||^. 


(13) 
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Equating  (12)  and  (13): 

\\A  -  DupWf  =  PIIf  +  \\Dhp\\f  -  trace  (B*p)  -  trace  ( pHBij )  (14) 

=  ||5||^+  trace  (pH D^Dijp)  -  trace  (B%p)  —  trace  ( pH Bij ). 

Given  i  and  j,  the  value  of  p  needed  to  minimize  the  expression  in  (14)  can  be  obtained 
by  differentiation.  To  this  end,  the  following  can  be  shown  to  hold  whenever  X  and  R  are 
real  matrices  with  appropriate  dimensions:5 

•  jjr  trace  (RX)  =  RT. 

•  trace  (XT R)  =  R. 

Noting  that  df(z)/dz  —  0  6  whenever  f(z)  is  an  analytic  function  of  a  complex  variable 
z,  it  follows  that  d(XH)/dX  =  0  for  matrices  X  and  XH  whose  elements  are  independent 
complex  variables.  (The  two  variables  z  and  z  are  considered  to  be  independent  since 
dz/dz  =  0.)  The  complex  versions  of  the  above  formulas  follow,  -where  now  R  is  a  constant 
complex  matrix  and  the  elements  of  the  matrices  X  and  XH  are  independent  complex 
variables: 

•  gY  trace  ( RX )  =  RT. 

•  qy  trace  ( XHR )  =  0. 

Differentiating  (14)  with  respect  to  p  yields  the  following  expression: 

DfjDijP  ~  &ij- 

Conjugating,  setting  this  expression  equal  to  zero,  and  assuming  that  the  columns  of  DtJ 
are  distinct,  one  obtains 

p  =  (D"jDii)-lSij.  (15) 

Substituting  this  into  (14)  gives  the  following  result: 

arg  min  \\A  —  Dijp\\p  =  arg max  trace  (BfXDf-  Dij)~1Bij]  .  (16) 

*d,p  *  d  \  j  j  / 

Let  there  be  M  DO  As  to  be  estimated  and  assume  that  A  has  r  columns.  Define 

Da  =f  dlr]T,  a  =f  ir. 

Then,  (16)  generalizes  to 

arg  min  ||  A  -  Dap\\2F  =  arg  max  trace  (jB^  Da)'1 5a)  ,  (17) 

where  Ba  is  an  r  x  p  matrix  of  DFT  coefficients.  Equation  (17)  represents  the  major  result 
of  this  paper.  Comparing  (17)  with  (5)  it  can  be  seen  that  the  columns  of  the  matrix  of 
Fourier  vectors  D  can  be  chosen  in  such  a  way  that  the  matrix  AH D  in  equation  (5)  can 
be  replaced  with  the  matrix  Ba  in  (17)  whose  elements  consist  of  DFT  coefficients.  Thus, 
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with  a  proper  choice  of  D  all  the  inner  products  comprising  the  elements  of  AH D  can  been 
computed  by  calculating  an  FFT  of  each  of  the  r  columns  of  A. 

When  there  are  two  sources  and  A  is  a  single  column  vector,  BtJ  is  a  2  x  1  matrix 
containing  the  ith  and  jth  DFT  coefficients  of  B.  If  A  does  not  need  to  be  zero-padded  in 
this  case  in  order  to  obtain  maximum  likelihood  resolution,  then  L  =  N  and  Dij  =  Dij  so 
that  has  orthonormal  columns  and 


argmin||A  —  Dijp\\2F  =  argmax  trace  (B^Bij)  .  (18) 

These  are  the  values  for  i  and  j  corresponding  to  the  two  largest  magnitude  values  in 
the  periodogram  corresponding  to  the  DFT  of  A.  These  can  be  found  from  a  single  one¬ 
dimensional  search  and  the  joint  estimation  of  DOAs  in  this  case  is  unnecessary.  When 
one  DOA  is  to  be  estimated  from  a  single  data  snapshot,  (18)  reduces  to 

arg  min  || A  —  Dip\\F  =  arg  max  (BfBi)  . 

As  is  well  known,  this  selects  the  bearing  corresponding  to  the  maximum  value  of  the 
periodogram,  i.e.,  the  DFT  coefficient  of  largest  magnitude,  as  the  ML  estimate  for  the 
bearing. 


4.  ALGORITHM  IMPLEMENTATION 


To  implement  the  FMLE  algorithm  when  there  are  two  sources  and  a  single  data  snap¬ 
shot,  the  following  search  needs  to  be  implemented  efficiently: 

argmax  trace  ^£(0? Dij)~1Bij')  .  (19) 

First  it  is  shown  that  the  elements  of  the  matrix  (D1^  Dij)-1  can  be  computed  in  closed 
form.  Consider  two  Fourier  N-vectors  di  and  dj,  where 

di  =f  [l exp(27r(\/—  l/N)i)  exp(27r(\/— l/IV)(iV  —  l)i)J  , 

and  similarly  for  dj.  Observing  that  dfdj  is  the  sum  of  the  first  N  terms  in  a  geometric 
series  the  following  is  obtained: 

def  jH  j  _  1  -  exp(— (2ttv A^T/N)(i  -  j)N ) 

2  J  1  -  exp(-(2-7rv/=T /N)(i-j)) 

Given  i  and  j,  the  matrix  Dij  can  then  be  written  as 


DfjDij  = 


7*j  N  ) 
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p 


and  the  inverse  as 

dhd  )~1  = _ _ _ (  ^ 

u  v)  N2~h>\2  [-ia 

Letting  Bij  =f  [bt  bj]T ,  (19)  can  then  be  expressed  as 


arg  max 

ij 


N 

N*  ~  I2 


(w2  +  n2 


(2/JV)R(6if>j7«)))- 


Writing  bi  =  and  7^  =  this  can  then  be  implemented  in  real  arithmetic 


as 


arg  max 


N 

N 2  -  R% 


(2/N')rlrJRZJ  cos (— 0,-  +  Oj  +  4>i:)\ 


(20) 


Equation  (20)  is  of  theoretical  interest  because  it  shows  how  to  modify  the  value  r2  +  rj  of 
the  point  (i,j)  in  the  two-dimensional  periodogram  based  only  on  a  single  DFT  of  the  data 
snapshot,  in  order  to  obtain  a  “normalized”  periodogram  whose  maximum  corresponds 
to  the  ML  estimate.  MATLAB  code  implementing  (20)  is  given  in  the  appendix.  Note 
that  the  entries  in  depend  only  on  the  difference  i  —  j  of  the  indices.  This 

is  advantageous  because,  as  can  be  seen  in  the  appendix,  the  main  loop  can  be  over  the 
difference  i—j.  Since  i—j  is  then  constant  for  each  iteration,  the  inverse  matrix  (D^  D7)-1 
is  also  constant  and  this  allows  the  inner  loop  to  be  vectorized,  with  a  resulting  increase  in 
efficiency,  especially  for  small  values  of  i  —  j. 


The  following  example  is  intended  to  illustrate  the  use  of  the  FMLE  method  to  pro¬ 
vide  increased  resolution  in  a  bearing  estimation  scenario  where  Cramer-Rao  (CR)  bound 
performance  is  not  a  necessity.  Assuming  that  there  is  an  N  element  array  and  that  a 
zero-padded  FFT  of  T  >  N  points  has  been  taken,  the  cost  is  approximately  ST2  flops, 
or  floating-point  operations.  This  is  so  because  the  preprocessing  takes  0(Tlog(T))  flops 
for  the  FFT  and  0(T)  flops  to  precompute  the  magnitude,  squared  magnitude,  and  an¬ 
gular  argument  of  each  DFT  coefficient.  To  implement  beamforming,  each  beam  takes  N 
complex  additions  and  multiplications,  for  a  total  of  8 NT  flops.  The  complexities  of  the 
two  methods  are  then  roughly  equal  when  the  number  of  beams  is  equal  to  the  number  of 
array  elements.  As  T/N  increases,  the  efficiency  of  the  FMLE  algorithm  with  respect  to 
beamforming  therefore  decreases. 

To  improve  the  efficiency  of  the  FMLE  algorithm,  first  perform  a  coarse  2-D  search  to 
find  two  candidate  DOAs  pi  and  p 2.  If  these  are  separated  by  more  than  1/AT,  then  ML 
estimates  can  be  obtained  by  performing  two  fine  1-D  searches  over  intervals  [<i\  a^\  and 
[hi  62]  (ai  <  61)  with  each  interval  centered  on  a  candidate  DOA.  If  p\  and  P2  are  not 
sufficiently  separated  (a2  +  1  >  61)  then  perform  a  single  fine  2-D  search  over  the  interval 
[a\  62].  The  two  signal  amplitudes  and  phases  can  be  calculated  according  to  (15),  which 
gives  maximum  likelihood  estimates:3 

Pi,  *  (DfjDaY'Bij. 

(Compare  this  with  (4).)  Since  these  represent  signal  amplitudes  and  phases,  the  FMLE 
algorithm  can  be  used  in  place  of  beamforming  in  situations  where  signal  powers  are  re¬ 
quired.  With  T  =  2 N,  the  coarse  search  can  be  performed  over  N/2  points  of  a  2N-point 
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zero-padded  DFT  and  the  fine  1-D  or  2-D  searches  using  all  2N  points.  The  complexity 
for  this  implementation  of  the  method  assuming  one  coarse  and  one  fine  2-D  search  is 
then  8{N / 2)2  +  8(a2N)2  =  8(1/4  +  4a2)Ar2,  where  a  is  the  fraction  of  the  total  number 
2 N  of  frequency  bins  over  which  the  fine  search  is  made.  For  a  search  over  16  bins  with 
N  =  64,  a  =  (16/128)  and  the  total  flop  count  is  8(5/16)Ar2.  For  N  beams  the  beam¬ 
forming  complexity  is  8NT  =  16iV2.  Taking  the  preprocessing  into  account,  the  FMLE 
algorithm  is  about  a  factor  of  4  faster  than  beamforming.  In  Figures  1  and  2,  complexities 
for  32-element  and  48-element  linear  arrays  are  given.  In  these  examples  efficiency  has 
been  traded  for  increased  resolution,  and  the  FMLE  method  as  implemented  here  searches 
over  twice  as  many  DOAs  as  there  are  beams,  with  a  resulting  decrease  in  efficiency  and 
increase  in  resolution. 


5.  CONCLUSIONS 


It  has  been  shown  that  the  DFT/periodogram  method  used  to  find  the  maximum  likeli¬ 
hood  estimate  of  a  single  DOA  can  be  extended  to  obtain  maximum  likelihood  estimates  for 
multiple  DOAs.  The  generalized  DFT /periodogram  method  is  summarized  in  the  following: 

argmax  trace  (Bn  (D^ D a)-1  B .  (21) 

For  the  case  of  two  sources  and  one  data  snapshot,  the  complexity  of  the  FMLE  algorithm 
is  comparable  to  that  of  beamforming,  with  less  bias  for  closely  spaced  sources.  The 
fundamental  reason  for  the  increase  in  performance  of  the  FMLE  algorithm,  compared 
with  the  standard  ML  search  method,  can  be  seen  by  comparing  equations  (5)  and  (21). 
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APPENDIX 


*/. 


I,  del  is  the  separation  in  DFT  bins.  Process  all  pairs  of  bins  with 
*1  an  equal  separation  at  the  same  time. 

/  An  L-element  FFT  here, 
valmax  =  -lelO; 
for  del  =  1:L-1 

i  inner  loop  has  been  vectorized  in  the  following  statements. 
phil=phi (1 :L-del) ;  l  DFT  coeff;  angle. 

phi2=phi (1+del :L) ; 

G1  =  g(l:L-del) 

G2  =  g(l+del:L) 

R1  =  r(l:L-del) 

R2  =  r(l+del:L) 

•/. 

I  R:  magnitude  of  gamma. 

•/.  T :  argument  of  gamma . 

*/.  F:  the  factor  in:  F*inv(  D’*D) 

dthetax  =  phi2-phil+T(del) ; 

Rh  =  (R(del)*2)*cos (dthetax) ; 

GG  =  Gl+G2-(Rl.*R2.*Rh) ; 

% 


V,  DFT  coeff;  magnitude  squared. 
DFT  coeff;  magnitude. 


'/,  2  flops  per  point . 
I  2  flops  per  point. 
X  4  flops  per  point. 


[val, index]  =  maz(GG); 
val  =  F(del)*val; 
if (val>valmax) 
valmajc  =  val; 
iisav  =  index; 
jjsav  =  index+del; 
delsav  =  del; 
end; 
end; 
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LOG10(#  FLOPS) 


1.5  1.6  1.7  1.8  1.9  2  2.1 

LOGICK#  SEARCH  VECTORS  =  #  BEAMS 

Figure  1.  Comparison  of  3  Bearing  Estimators,  32-Element  Array 
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LOG10(#  FLOPS) 


5! _ 1 _ i _ i _ 1 _ i _ i _ I 

1.8  1.85  1.9  1.95  2  2.05  2.1  2.15 

LOG10(#  SEARCH  VECTORS) 

Figure  2.  Comparison  of  3  Bearing  Estimators,  48-Element  Array 
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